Integrating Heterogeneous omics Data via Statistical Inference and Learning Techniques
Abstract
Multi-omics studies are believed to provide a more comprehensive picture of a complex biological system than traditional studies that rely on a single omics data source. From a statistical point of view, however, data integration poses non-trivial challenges. In this review, we highlight recent statistical inference and learning techniques that have been devised in this context. In the first part of the article, we focus on techniques for identifying a relevant biological sub-system from combined omics data. In the second part, we ask how integrated omics data could be used to improve personalized patient treatment in both supervised and unsupervised learning settings. Different classes of algorithms are discussed for both application tasks, and existing and future challenges for data integration methods are pointed out.
Similar resources
Techniques for integrating ‐omics data
The challenge for -omics research is to tackle the problem of fragmentation of knowledge by integrating several sources of heterogeneous information into a coherent entity. It is widely recognized that successful data integration is one of the keys to improve productivity for stored data. Through proper data integration tools and algorithms, researchers may correlate relationships that enable t...
Statistical Learning Methods for High Dimensional Genomic Data
Due to their high dimensionality, -omics technologies require the development of computational methods that are able to work with a large number of variables. Each data type is characterized by its method of measurement and by the biological aspect under study. Understanding the data properties allows the design of sophisticated and effective computational models that are able to uncover and expl...
How To Use CORREP to Estimate Multivariate Correlation and Statistical Inference Procedures
OMICS data are increasingly available to biomedical researchers, and (biological) replications are more and more affordable for gene microarray or proteomics experiments. The functional relationship between a pair of genes or proteins is often inferred by calculating the correlation coefficient between their expression profiles. Classical correlation estimation techniques, such as Pear...
Elementary: Large-Scale Knowledge-Base Construction via Machine Learning and Statistical Inference
Researchers have approached knowledge-base construction (KBC) with a wide range of data resources and techniques. We present Elementary, a prototype KBC system that is able to combine diverse resources and different KBC techniques via machine learning and statistical inference to construct knowledge bases. Using Elementary, we have implemented a solution to the TAC-KBP challenge with quality co...
Application of ensemble learning techniques to model the atmospheric concentration of SO2
For pollution prediction modeling, the study adopts homogeneous (random forest, bagging, and additive regression) and heterogeneous (voting) ensemble classifiers to predict the atmospheric concentration of sulphur dioxide. For model validation, results were compared against widely known single base classifiers such as support vector machine, multilayer perceptron, linear regression and re...
Publication date: 2016